Huddersfield
Essay cheating at universities an 'open secret'
A BBC investigation has uncovered claims that essay cheating remains widespread at UK universities despite the introduction of a law designed to stop it. Since April 2022, it has been illegal to provide essays for students in post-16 education in England. But so far there have been no prosecutions. The BBC has spoken to a former lecturer who describes essay cheating as an open secret and to a businessman who claims to have made millions from selling model answer essays to university students. Universities UK, which represents 141 institutions, said there were severe penalties for students caught submitting work that was not their own.
- North America > United States (0.15)
- North America > Central America (0.14)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.05)
PhreshPhish: A Real-World, High-Quality, Large-Scale Phishing Website Dataset and Benchmark
Dalton, Thomas, Gowda, Hemanth, Rao, Girish, Pargi, Sachin, Khodabakhshi, Alireza Hadj, Rombs, Joseph, Jou, Stephan, Marwah, Manish
Phishing remains a pervasive and growing threat, inflicting heavy economic and reputational damage. While machine learning has been effective in real-time detection of phishing attacks, progress is hindered by a lack of large, high-quality datasets and benchmarks. In addition to poor quality due to challenges in data collection, existing datasets suffer from leakage and unrealistic base rates, leading to overly optimistic performance results. In this paper, we introduce PhreshPhish, a large-scale, high-quality dataset of phishing websites that addresses these limitations. Compared to existing public datasets, PhreshPhish is substantially larger and provides significantly higher quality, as measured by the estimated rate of invalid or mislabeled data points. Additionally, we propose a comprehensive suite of benchmark datasets specifically designed for realistic model evaluation by minimizing leakage, increasing task difficulty, enhancing dataset diversity, and adjusting base rates to those more likely to be seen in the real world. We train and evaluate multiple solution approaches to provide baseline performance on the benchmark sets. We believe the availability of this dataset and benchmarks will enable realistic, standardized model comparison and foster further advances in phishing detection. The datasets and benchmarks are available on Hugging Face (https://huggingface.co/datasets/phreshphish/phreshphish).
- North America > United States (0.14)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.04)
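The PhreshPhish abstract's point about unrealistic base rates can be illustrated with a small worked example. The sketch below is not taken from the paper: the detector's true-positive and false-positive rates and both base rates are hypothetical values chosen only to show how precision changes when the share of phishing pages in the evaluation set changes.

```python
def precision_at_base_rate(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision implied by a detector's TPR and FPR at a given phishing base rate."""
    true_pos = tpr * base_rate            # fraction of all pages correctly flagged
    false_pos = fpr * (1.0 - base_rate)   # fraction of all pages wrongly flagged
    return true_pos / (true_pos + false_pos)


# Hypothetical detector: 95% true-positive rate, 1% false-positive rate.
balanced = precision_at_base_rate(0.95, 0.01, 0.5)     # 50% phishing (balanced test set)
realistic = precision_at_base_rate(0.95, 0.01, 0.001)  # 0.1% phishing (assumed rate)

print(f"precision on balanced set:  {balanced:.3f}")
print(f"precision at 0.1% phishing: {realistic:.3f}")
```

Under these assumed numbers the same detector looks near-perfect on a balanced set yet raises mostly false alarms at the lower base rate, which is the kind of optimism the abstract warns about.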
Advancing Offline Handwritten Text Recognition: A Systematic Review of Data Augmentation and Generation Techniques
Rassul, Yassin Hussein, Ahmed, Aram M., Fattah, Polla, Hassan, Bryar A., Abdulkareem, Arwaa W., Rashid, Tarik A., Lu, Joan
Offline Handwritten Text Recognition (HTR) systems play a crucial role in applications such as historical document digitization, automatic form processing, and biometric authentication. However, their performance is often hindered by the limited availability of annotated training data, particularly for low-resource languages and complex scripts. This paper presents a comprehensive survey of offline handwritten data augmentation and generation techniques designed to improve the accuracy and robustness of HTR systems. We systematically examine traditional augmentation methods alongside recent advances in deep learning, including Generative Adversarial Networks (GANs), diffusion models, and transformer-based approaches. Furthermore, we explore the challenges associated with generating diverse and realistic handwriting samples, particularly in preserving script authenticity and addressing data scarcity. This survey follows the PRISMA methodology, ensuring a structured and rigorous selection process. Our analysis began with 1,302 primary studies, which were filtered down to 848 after removing duplicates, drawing from key academic sources such as IEEE Digital Library, Springer Link, Science Direct, and ACM Digital Library. By evaluating existing datasets, assessment metrics, and state-of-the-art methodologies, this survey identifies key research gaps and proposes future directions to advance the field of handwritten text generation across diverse linguistic and stylistic landscapes.
- Asia > Middle East > Iraq > Erbil Governorate > Erbil (0.04)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.04)
- North America > United States (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
Hybrid Adversarial Spectral Loss Conditional Generative Adversarial Networks for Signal Data Augmentation in Ultra-precision Machining Surface Roughness Prediction
Shang, Suiyan, Cheung, Chi Fai, Zheng, Pai
Accurate surface roughness prediction in ultra-precision machining (UPM) is critical for real-time quality control, but small datasets hinder model performance. We propose HAS-CGAN, a Hybrid Adversarial Spectral Loss CGAN, for effective UPM data augmentation. Among five CGAN variants tested, HAS-CGAN excels in 1D force signal generation, particularly for high-frequency signals, achieving >0.85 wavelet coherence through Fourier-domain optimization. By combining generated signals with machining parameters, prediction accuracy significantly improves. Experiments with traditional ML (SVR, RF, LSTM) and deep learning models (BPNN, 1DCNN, CNN-Transformer) demonstrate that augmenting training data with 520+ synthetic samples reduces prediction error from 31.4% (original 52 samples) to ~9%, effectively addressing data scarcity in UPM roughness prediction.
- Asia > China > Hong Kong (0.04)
- North America > United States > Oklahoma (0.04)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.04)
- Semiconductors & Electronics (0.68)
- Information Technology (0.46)
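The HAS-CGAN abstract mentions a spectral loss and Fourier-domain optimization without giving the formula. As a rough illustration only, the sketch below assumes the spectral term is a mean squared difference between DFT magnitude spectra and that it is added to an adversarial term with a weight. The function names, the weighting scheme and the test signals are all hypothetical and are not the authors' implementation.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each discrete Fourier coefficient of a real 1D signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


def spectral_loss(real, generated):
    """Assumed spectral term: mean squared difference of magnitude spectra."""
    real_mag = dft_magnitudes(real)
    gen_mag = dft_magnitudes(generated)
    return sum((r - g) ** 2 for r, g in zip(real_mag, gen_mag)) / len(real_mag)


def hybrid_loss(adversarial_loss, real, generated, weight=1.0):
    """Hypothetical combination: adversarial term plus weighted spectral term."""
    return adversarial_loss + weight * spectral_loss(real, generated)


# Toy 1D signals standing in for force signals: a low- and a high-frequency sine.
n = 64
low = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
high = [math.sin(2 * math.pi * 20 * t / n) for t in range(n)]

print(spectral_loss(low, low))   # identical spectra give zero loss
print(spectral_loss(low, high))  # mismatched frequency content is penalised
```

A term of this kind penalises generated signals whose frequency content differs from the real signal's, which is one plausible reading of why a Fourier-domain loss would help with high-frequency signals.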
The Impact of Generative AI Coding Assistants on Developers Who Are Visually Impaired
Flores-Saviaga, Claudia, Hanrahan, Benjamin V., Imteyaz, Kashif, Clarke, Steven, Savage, Saiph
The rapid adoption of generative AI in software development has impacted the industry, yet its effects on developers with visual impairments remain largely unexplored. To address this gap, we used an Activity Theory framework to examine how developers with visual impairments interact with AI coding assistants. For this purpose, we conducted a study where developers who are visually impaired completed a series of programming tasks using a generative AI coding assistant. We uncovered that, while participants found the AI assistant beneficial and reported significant advantages, they also highlighted accessibility challenges. Specifically, the AI coding assistant often exacerbated existing accessibility barriers and introduced new challenges. For example, it overwhelmed users with an excessive number of suggestions, leading developers who are visually impaired to express a desire for "AI timeouts." Additionally, the generative AI coding assistant made it more difficult for developers to switch contexts between the AI-generated content and their own code. Despite these challenges, participants were optimistic about the potential of AI coding assistants to transform the coding experience for developers with visual impairments. Our findings emphasize the need to apply activity-centered design principles to generative AI assistants, ensuring they better align with user behaviors and address specific accessibility needs. This approach can enable the assistants to provide more intuitive, inclusive, and effective experiences, while also contributing to the broader goal of enhancing accessibility in software development.
- North America > Mexico (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.04)
NaijaNLP: A Survey of Nigerian Low-Resource Languages
Of the more than 500 languages in Nigeria, three (Hausa, Yorùbá and Igbo), spoken by over 175 million people, account for about 60% of the spoken languages. However, these languages are categorised as low-resource due to insufficient resources to support tasks in computational linguistics. Several research efforts and initiatives have been presented; however, a coherent understanding of the state of Natural Language Processing (NLP), from grammatical formalisation to linguistic resources that support complex tasks such as language understanding and generation, is lacking. This study presents the first comprehensive review of advancements in low-resource NLP (LR-NLP) research across the three major Nigerian languages (NaijaNLP). We quantitatively assess the available linguistic resources and identify key challenges. Although a growing body of literature addresses various NLP downstream tasks in Hausa, Igbo, and Yorùbá, only about 25.1% of the reviewed studies contribute new linguistic resources. This finding highlights a persistent reliance on repurposing existing data rather than generating novel, high-quality resources. Additionally, language-specific challenges, such as the accurate representation of diacritics, remain under-explored. To advance NaijaNLP and LR-NLP more broadly, we emphasise the need for intensified efforts in resource enrichment, comprehensive annotation, and the development of open collaborative initiatives.
- Africa > Niger (0.14)
- Africa > Cameroon (0.14)
- Africa > Nigeria > Jigawa State > Dutse (0.05)
- Research Report (1.00)
- Overview (1.00)
- Health & Medicine (1.00)
- Information Technology > Security & Privacy (0.46)
- Media > News (0.46)
YOLOv12: A Breakdown of the Key Architectural Features
Alif, Mujadded Al Rabbani, Hussain, Muhammad
This paper presents an architectural analysis of YOLOv12, a significant advancement in single-stage, real-time object detection building upon the strengths of its predecessors while introducing key improvements. The model incorporates an optimised backbone (R-ELAN), 7x7 separable convolutions, and FlashAttention-driven area-based attention, improving feature extraction, efficiency, and detection robustness. With multiple model variants, similar to its predecessors, YOLOv12 offers scalable solutions for both latency-sensitive and high-accuracy applications. Experimental results show consistent gains in mean average precision (mAP) and inference speed, making YOLOv12 a compelling choice for applications in autonomous systems, security, and real-time analytics. By achieving an optimal balance between computational efficiency and performance, YOLOv12 sets a new benchmark for real-time computer vision, facilitating deployment across diverse hardware platforms, from edge devices to high-performance clusters.
LLM-based event log analysis techniques: A survey
Akhtar, Siraaj, Khan, Saad, Parkinson, Simon
Event log analysis is an important task that security professionals undertake. Event logs record key information on activities that occur on computing devices, and due to the substantial number of events generated, they consume a large amount of time and resources to analyse. This demanding and repetitive task is also prone to errors. To address these concerns, researchers have developed automated techniques to improve the event log analysis process. Large Language Models (LLMs) have recently demonstrated the ability to successfully perform a wide range of tasks that individuals would usually partake in, to high standards, and at a pace and degree of complexity that outperform humans. Due to this, researchers are rapidly investigating the use of LLMs for event log analysis. This includes fine-tuning, Retrieval-Augmented Generation (RAG) and in-context learning, which affect performance. These works demonstrate good progress, yet there is a need to understand the developing body of knowledge, identify commonalities between works, and identify key challenges and potential solutions to further developments in this domain. This paper aims to survey LLM-based event log analysis techniques, providing readers with an in-depth overview of the domain, gaps identified in previous research, and concluding with potential avenues to explore in future.
- North America > United States > New York > New York County > New York City (0.04)
- South America > Brazil (0.04)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.04)
- Asia > Middle East > Yemen > Amanat Al Asimah > Sanaa (0.04)
- Research Report > New Finding (1.00)
- Overview (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (0.67)
Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering
Zhou, Wei, Mesgar, Mohsen, Friedrich, Annemarie, Adel, Heike
Complex table question answering (TQA) aims to answer questions that require complex reasoning, such as multi-step or multi-category reasoning, over data represented in tabular form. Previous approaches demonstrated notable performance by leveraging either closed-source large language models (LLMs) or fine-tuned open-weight LLMs. However, fine-tuning LLMs requires high-quality training data, which is costly to obtain, and utilizing closed-source LLMs poses accessibility challenges and leads to reproducibility issues. In this paper, we propose Multi-Agent Collaboration with Tool use (MACT), a framework that requires neither closed-source models nor fine-tuning. In MACT, a planning agent and a coding agent that also make use of tools collaborate to answer questions. Our experiments on four TQA benchmarks show that MACT outperforms previous SoTA systems on three out of four benchmarks and that it performs comparably to the larger and more expensive closed-source model GPT-4 on two benchmarks, even when using only open-weight models without any fine-tuning. We conduct extensive analyses to prove the effectiveness of MACT's multi-agent collaboration in TQA.
- North America > Canada > Saskatchewan > Saskatoon (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Research Report (1.00)
- Financial News (0.68)
- Transportation > Passenger (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Transportation > Air (0.93)
- Consumer Products & Services > Travel (0.93)
PV-faultNet: Optimized CNN Architecture to detect defects resulting efficient PV production
Zaman, Eiffat E, Khanam, Rahima
The global shift towards renewable energy has made PV cell manufacturing pivotal, as PV cells are the fundamental building block of green energy. However, the manufacturing process is complex, and defects introduced during production can reduce overall efficiency. At present, defects are detected by manual inspection, which can introduce bias and is inefficient in time and cost. Although automated solutions have been proposed, most of them are resource-intensive, making them ineffective in production environments. In that context, this study presents PV-faultNet, a lightweight Convolutional Neural Network (CNN) architecture optimized for efficient and real-time defect detection in photovoltaic (PV) cells, designed to be deployable on resource-limited production devices. Addressing computational challenges in industrial PV manufacturing environments, the model includes only 2.92 million parameters, significantly reducing processing demands without sacrificing accuracy. Comprehensive data augmentation techniques were implemented to tackle data scarcity, thus enhancing model generalization and maintaining a balance between precision and recall. The proposed model achieved high performance with 91% precision, 89% recall, and a 90% F1 score, demonstrating its effectiveness for scalable quality control in PV production.
- North America > United States (0.14)
- Europe > United Kingdom > England > West Yorkshire > Huddersfield (0.04)
- Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
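The precision, recall and F1 figures reported for PV-faultNet can be checked against each other with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below uses only the three percentages quoted in the abstract; the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract: 91% precision, 89% recall.
f1 = f1_score(0.91, 0.89)
print(f"F1 = {f1:.4f}")
```

The result rounds to 0.90, consistent with the 90% F1 score stated in the abstract.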